Development of web-based dynamic nomogram to predict survival in patients with gastric cancer: a population-based study

Gastric cancer (GC) is the fifth most frequent malignancy worldwide and the third leading cause of cancer-associated mortality. The goal of this study was to construct a predictive model and a nomogram to predict the survival of GC patients. This historical cohort study assessed 733 patients who underwent treatment for GC. Univariable and multivariable Cox proportional hazard (CPH) survival analyses were applied to identify the factors related to overall survival (OS). A dynamic nomogram was developed as a graphical representation of the CPH regression model. The internal validation of the nomogram was evaluated by Harrell's concordance index (C-index) and time-dependent AUC. The multivariable Cox model revealed that age, body mass index (BMI), tumour grade, and tumour depth were significantly associated with the mortality hazard of GC patients (P < 0.05). The nomogram showed acceptable discriminatory performance, with a C-index of 0.64 (CI 0.61, 0.67). We constructed and internally validated an original predictive nomogram for OS in patients with GC. Such a nomogram may help predict the individual probability of OS in patients treated for GC.


Materials and methods
Patients and data sources. The demographic and clinicopathological characteristics of 733 GC patients were extracted from Taleghani Hospital, a tertiary university hospital in Tehran, Iran, for the period between 2013 and 2020. This research complies with the principles of the Declaration of Helsinki. Informed consent was obtained from the patients for the use of their medical information. The methods were carried out in accordance with the relevant guidelines and regulations. The Ethics Committee of Iran University of Medical Sciences approved the study (Ethical code: IR.IUMS.REC.1399.122).
Demographic and clinical variables. Survival time, defined as the number of months from cancer diagnosis until death, was the outcome variable. The demographic and clinical predictors included sex, marital status, smoking status, body mass index (BMI), family history, type of treatment, tumour grade, tumour depth, and number of involved lymph nodes. Each patient's survival status was recorded as alive or dead.
Statistical analysis. Continuous variables were described as mean ± SD, and categorical variables as frequency and percentage. Missing data were imputed by fully conditional specification 25. The Kaplan-Meier method was used to estimate the survival function. We applied univariable CPH models to explore the relationship between patient survival and each explanatory variable. Variables with P < 0.2 in the univariable analysis were entered into the multivariable regression model, and the nomogram was then built from the multivariable CPH model. The C-index, a global index of the predictive ability of a survival model, and the time-dependent area under the ROC curve were calculated to assess internal validation. Internal calibration was assessed with bootstrap resampling by plotting the probabilities predicted by the model against the actual survival probabilities. The analysis was performed using SPSS 23 and Stata 11. The survival, DynNom, rms, and hdnom packages in R 4.1.0 were used to create the dynamic nomogram and to perform validation and calibration. Additionally, decision curve analysis (DCA) was carried out with "dcurves". The decision curve compares the model with two extreme strategies: treating all patients and treating none. A model is considered clinically useful when its net benefit is greater than that of both strategies. If a model shows an acceptable advantage over a wide range of clinically reasonable threshold preferences, the model or test can be recommended.
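The net-benefit comparison used in decision curve analysis can be sketched as follows. This is a minimal illustration in Python, not the R workflow used in the study: the function names and the toy data are hypothetical, and the outcome is treated as a binary event at a fixed time horizon, so censoring is ignored.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    n = len(y_true)
    true_pos = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    false_pos = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # Odds of the threshold probability weight the harm of a false positive.
    weight = threshold / (1.0 - threshold)
    return true_pos / n - (false_pos / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the default strategy that treats all patients."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = event (death) by the horizon, 0 = no event.
y = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
p = [0.9, 0.8, 0.6, 0.7, 0.3, 0.2, 0.1, 0.4, 0.5, 0.2]

for pt in (0.2, 0.5):
    nb_model = net_benefit(y, p, pt)
    nb_all = net_benefit_treat_all(y, pt)
    nb_none = 0.0  # treating nobody yields zero net benefit by definition
    useful = nb_model > max(nb_all, nb_none)
    print(f"threshold={pt:.2f} model={nb_model:.3f} all={nb_all:.3f} useful={useful}")
```

A decision curve is obtained by repeating this calculation over a grid of threshold probabilities and plotting the three net-benefit lines together.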

Results
The study population consisted of 733 confirmed patients with GC who underwent treatment. The median follow-up time was 9.55 months (IQR = 4-19.13, range 0.  Figure 1 shows the Kaplan-Meier survival curves according to American Joint Committee on Cancer (AJCC) staging, with the numbers at risk listed below the curves.
The results of the multivariable CPH model are presented in Table 2. Variables with P < 0.2 in the univariable analysis were candidates for the multivariable regression analysis. Age at diagnosis, BMI, tumour grade, and tumour depth were significant in the multivariable CPH model (P < 0.05).
The results showed that for every 10-year increase in age, the hazard increased by about 10% (HR = 1.01 per year of age, P < 0.05). The hazard in overweight patients was 46% lower than in the normal-weight group (HR = 0.54, P < 0.05); obese patients had a higher hazard than normal-weight patients, but the difference was not significant (HR = 1.2, P = 0.518).
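As a brief check on these figures, and assuming that the reported HR of 1.01 refers to a one-year increase in age (the unit is not stated explicitly), the log-linearity of the Cox model gives the hazard ratio for a 10-year increase, and the overweight effect translates into a relative reduction in hazard:

```latex
\mathrm{HR}_{10\ \text{years}} = \exp\!\left(10\,\beta_{\text{age}}\right)
  = \left(\mathrm{HR}_{1\ \text{year}}\right)^{10} = 1.01^{10} \approx 1.105,
\qquad
1 - \mathrm{HR}_{\text{overweight}} = 1 - 0.54 = 0.46 .
```

That is, roughly a 10% higher hazard per decade of age and a 46% lower hazard for overweight relative to normal-weight patients.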
In addition, the hazard in patients who underwent chemotherapy, radiotherapy, and immunotherapy (presented as other treatments in the table) was 33% higher than in patients who had surgery; however, the effect of treatment type was not significant (HR = 1.33, P = 0.057). The hazard in patients with undifferentiated tumours was 57% higher than in those with well-differentiated tumours (HR = 1.57, P < 0.05). The hazard also rose significantly as tumour depth increased: the greater the tumour depth, the higher the HR (P < 0.05).
The results of the multivariable CPH model are presented as a nomogram in Fig. 2. With this nomogram, the probability of survival for a GC patient can be predicted at a specific time point. Each patient characteristic is located on its own variable axis. To predict the survival probability of a patient, a vertical line is drawn from the patient's value on each variable axis up to the points scale at the top, which gives the number of points assigned to that value. The points from all variables are then summed. Finally, the sum is located on the total points axis and projected vertically onto the bottom axis to obtain a personalized survival probability. Figure 3 shows a screenshot of the web-based nomogram, which is accessible at https://nbshiny.shinyapps.io/GastricDynNom/. This simple-to-use web-based nomogram is convenient to apply and can aid personalized treatment and clinical decision-making. The dynamic nomogram provides sliders for continuous covariates, bounded by their observed ranges, and drop-down boxes for categorical ones.
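The point-scoring procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the fitted model behind Fig. 2: it uses three variables, takes the coefficients as the logarithms of the hazard ratios quoted in the text, and relies on a hypothetical age range and a hypothetical baseline 12-month survival for the lowest-risk patient.

```python
import math

# Assumptions for illustration only (not the fitted model of Table 2).
AGE_RANGE = (20, 90)            # hypothetical slider bounds for age
BETA_AGE = math.log(1.01)       # log-hazard per year of age
BETA_BMI = {"normal": 0.0, "overweight": math.log(0.54), "obese": math.log(1.20)}
BETA_GRADE = {"well": 0.0, "undifferentiated": math.log(1.57)}
BASELINE_SURV_12M = 0.60        # hypothetical 12-month survival at zero total points


def contributions(age, bmi, grade):
    """Log-hazard contribution of each variable, shifted so its minimum is zero."""
    return {
        "age": BETA_AGE * (age - AGE_RANGE[0]),
        "bmi": BETA_BMI[bmi] - min(BETA_BMI.values()),
        "grade": BETA_GRADE[grade] - min(BETA_GRADE.values()),
    }


def max_contribution():
    """Largest possible single-variable contribution; it is mapped to 100 points."""
    return max(
        BETA_AGE * (AGE_RANGE[1] - AGE_RANGE[0]),
        max(BETA_BMI.values()) - min(BETA_BMI.values()),
        max(BETA_GRADE.values()) - min(BETA_GRADE.values()),
    )


POINTS_PER_UNIT = 100.0 / max_contribution()


def nomogram_points(age, bmi, grade):
    """Points read off the top scale for each variable value."""
    return {k: v * POINTS_PER_UNIT for k, v in contributions(age, bmi, grade).items()}


def survival_probability(age, bmi, grade):
    """Project the total points onto the survival axis: S = S0 ** exp(lp)."""
    total_points = sum(nomogram_points(age, bmi, grade).values())
    linear_predictor = total_points / POINTS_PER_UNIT
    return BASELINE_SURV_12M ** math.exp(linear_predictor)


pts = nomogram_points(65, "normal", "undifferentiated")
print({k: round(v, 1) for k, v in pts.items()}, round(sum(pts.values()), 1))
print(round(survival_probability(65, "normal", "undifferentiated"), 3))
```

A web-based dynamic nomogram performs the same mapping interactively: the sliders and drop-down boxes supply the covariate values, and the survival probability is recomputed from the underlying model.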
Internal validation and calibration. Internal validation was assessed using the C-index and the time-dependent AUC at the evaluation time points. The C-index was 0.64 (CI 0.61, 0.67). We also validated the performance of the CPH model with bootstrap resampling at yearly intervals from the first to the sixth year. The C-index of the presented model (0.64) was slightly lower than that of AJCC clinical staging (0.68). The time-dependent AUCs at 1-, 2-, 3-, 4-, 5-, and 6-year follow-up are presented in Fig. 4A. Based on DCA, if the threshold probability is > 0.45, the developed nomogram is superior in predicting survival in all of the patients (Fig. 4B). In addition, internal calibration was assessed with bootstrap resampling by plotting the probabilities predicted by the model against the actual survival probabilities. For this purpose, the sample was split into ten risk groups, and the survival probabilities at 1 and 2 years were obtained and summarized as calibration plots in Fig. 4C.
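For readers unfamiliar with the C-index, the following Python sketch shows the basic pairwise definition of Harrell's concordance index for right-censored data. It is a simplified illustration with hypothetical toy data, not the routine used in the study: tied event times are simply skipped, and no confidence interval or bootstrap correction is computed.

```python
def harrell_c_index(time, event, risk):
    """Fraction of comparable patient pairs in which the patient who died
    earlier was assigned the higher predicted risk (ties in risk count 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if event[i] != 1:
            continue  # a censored patient cannot be the earlier failure of a pair
        for j in range(n):
            if time[i] < time[j]:  # patient i died before j was last seen
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: months of follow-up, death indicator, predicted risk.
months = [5, 10, 12, 20, 30]
died = [1, 1, 0, 1, 0]
score = [0.9, 0.4, 0.6, 0.5, 0.1]

print(harrell_c_index(months, died, score))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.64 and 0.68 should be read.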

Discussion
This study makes a contribution through the use of a historical cohort of patients with GC who were treated in Iran from 2009 to 2020. To our knowledge, this is the first nomogram study in GC patients from the Iranian population; the nomogram is a user-friendly clinical tool, and the study had an acceptable sample size and long-term follow-up. We developed a web-based nomogram that can be used to predict the survival probability. The multivariable CPH model showed that age at diagnosis, BMI, tumour grade, and tumour depth were statistically significant. Furthermore, we constructed a nomogram to predict OS, which could provide individualized estimates of potential survival and aid individualized management decisions for GC. The C-index, calculated as 0.64 (CI 0.61, 0.67), was used to evaluate the internal validity of the model. A nomogram is a precise and useful clinical tool that can help clinicians predict the probability of an outcome event, here survival.
A variety of nomograms have been built to predict the therapeutic benefits and the postoperative survival rate in patients with GC 26-28. Mu et al. predicted the long-term survival of 421 GC patients who underwent D2 radical lymphadenectomy using a survival model and established a nomogram 27. The C-index of their model was 0.76 on internal validation. Their significant factors were tumour staging, tumour location, BMI, and neural and vessel invasion. In our investigation, age, tumour grade, and tumour depth were the main factors in the multivariable CPH model, and the C-index was 0.64. Han et al. established a nomogram predicting 5- and 10-year overall survival after D2 gastrectomy for GC 29; the C-index for OS was 0.69.
Another gastric cancer study used a multivariable Fine and Gray regression model, which accounts for competing risks, to predict disease-specific mortality (DSM) 30. The goal of that study was to develop the first pre-treatment gastric cancer nomogram for predicting DSM, and the new nomogram showed acceptable discrimination. Their results showed that the newly developed nomogram accurately predicted DSM and can be used for patient counselling in medical practice. The C-index of their model was 0.887, compared with 0.794 for AJCC clinical staging. In our study, however, the C-index of our model (0.64) was slightly lower than that of AJCC clinical staging (0.68).
Here, we constructed a nomogram to predict the survival rate in GC patients. According to previous studies, a C-index > 0.6 indicates that a model has acceptable accuracy 29,31,32, and the C-index in our study met this criterion. A few studies have used the AUC to assess the prediction of OS 26,28,33. The AUC values were more than 60% for 1-, 3-, and 5-year survival, which is compatible with our study. In addition, DCA was used to evaluate the clinical application value of the nomogram 28,30-32,34. Lu et al. used a nomogram to predict recurrence-free survival (RFS) and the benefit of adjuvant chemotherapy after radical resection in high-stage GC patients 31. They applied a CPH model to identify predictive factors for RFS and established a novel nomogram for GC after radical resection.
Our multivariable CPH regression model showed that age at diagnosis, BMI, tumour grade, and tumour depth were independent risk factors in GC. Most previous studies focused on independent variables associated with GC and found that tumour depth, differentiation grade, size, and lymphatic invasion were closely associated with patients' survival 35

Study limitation.
The key strength of this study is the long-term follow-up period. The second strength is the web-based nomogram, with which any expert can calculate the overall survival probability. The study also had several limitations. First, some variables, such as Helicobacter pylori infection status, tumour location, demarcation line of the tumour lesion, tumour markers, nutritional status, and Charlson Comorbidity Index, may also be potential risk factors in patients with GC and need to be incorporated into our model. Second, only internal validation was performed; future studies should perform external validation using another test dataset. Third, the sample was relatively small and came from one center, so larger sample sizes are recommended for similar studies. Fourth, some variables, such as BMI and age, were collected as continuous rather than categorical variables 37.

Conclusion
We successfully established a novel nomogram using patient data from the GC database of Taleghani University Hospital. Age at diagnosis, BMI, tumour grade, and tumour depth contributed significantly to predicting the OS of patients with GC.